To gain access, in theory you only need a terminal. However, this guide uses Visual Studio Code, which can be downloaded from https://code.visualstudio.com/, as it makes the process simpler.
Once you have installed it, also install the remote extension (under the Extensions or View panel).
Open VScode
Click on “Extensions”
Install “Remote”
Copy the id_rsa and config files you received from us to the .ssh directory in your home folder. If you already have an id_rsa and/or config file in your .ssh directory, do not overwrite them. Contact edoardo.piombo@umu.se and I will explain how to proceed.
If you do not see any “.ssh” in the home folder of your computer, activate the option to show hidden files (how to do this depends on the operating system).
If you still do not see it after this, then make it yourself as a new folder and then move the “id_rsa” and “config” files there.
If you cannot find out how to show hidden files on your computer, you can do it from Visual Studio Code by going into settings and removing “**/.ssh” from the exclude list. After that, you can find your “.ssh” folder by choosing “Open folder” and selecting your home directory. To copy the files to your “.ssh” directory, simply drag them there in Visual Studio Code.
Open VScode settings
Remove “**/.ssh” from the list of excluded files
Open your home folder
Drag the files to your .ssh folder
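If you are comfortable with a terminal (Linux, macOS, or a bash shell on Windows), the copy step above can also be sketched as the small helper below. This is only a sketch under assumptions: the function name install_ssh_files and the ~/Downloads location are made up for illustration. It refuses to overwrite an existing id_rsa or config, in line with the warning above.

```shell
# Copy id_rsa and config into the .ssh folder without overwriting existing files.
# Usage: install_ssh_files SOURCE_DIR [TARGET_DIR]   (TARGET_DIR defaults to ~/.ssh)
install_ssh_files() {
    src_dir="$1"
    ssh_dir="${2:-$HOME/.ssh}"

    # Create the folder if it is missing and keep it private to your user.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"

    for name in id_rsa config; do
        if [ -e "$ssh_dir/$name" ]; then
            # Never overwrite an existing file: ask for help instead.
            echo "Skipping $name: $ssh_dir/$name already exists" >&2
        elif [ -f "$src_dir/$name" ]; then
            cp "$src_dir/$name" "$ssh_dir/$name"
            # Both files should be readable and writable by their owner only.
            chmod 600 "$ssh_dir/$name"
        else
            echo "Cannot find $src_dir/$name" >&2
        fi
    done
}

# Example (assumes the two files were saved in ~/Downloads):
# install_ssh_files "$HOME/Downloads"
```

The permissions matter because SSH usually refuses to use a private key that other users can read.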
You can now close the current window of Visual Studio Code: the setup is done!
Note: This connection method will not work unless you are at the university or you have the university VPN active.
Open a new Visual Studio Code window. Open the remote extension by clicking on the double arrows on the bottom left, then click on “Connect to host”.
You should see “micro” in the list of available hosts; connect to it.
If asked to choose an operating system, select “Linux”.
A new window will open, connecting you to the server. You can now use “Open folder” and accept the default path to view the content of your home directory. Never try to open the general home directory, as that will crash the server.
Once you have opened your home directory, select “terminal -> new terminal” from the upper menu to open a text interface.
If you are involved in more than one project, when using “Open folder” please open only the folder of the project you want to work on, to keep the session as light as possible.
You can now work on the server :)
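If “micro” does not show up in the host list, or if you prefer a plain terminal, the check below may help. It is a rough sketch that assumes the config file you received defines a host alias named micro on a “Host” line; the function name has_ssh_host is made up for illustration.

```shell
# Succeed if ALIAS appears on a "Host" line of an SSH config file.
# Usage: has_ssh_host ALIAS [CONFIG_FILE]   (CONFIG_FILE defaults to ~/.ssh/config)
has_ssh_host() {
    alias_name="$1"
    config_file="${2:-$HOME/.ssh/config}"
    [ -f "$config_file" ] || return 1
    # Match lines such as "Host micro" or "Host other micro", but not "HostName ...".
    grep -Eiq "^[[:space:]]*Host[[:space:]]+(.*[[:space:]])?${alias_name}([[:space:]]|\$)" "$config_file"
}

# Example: connect from a terminal only if the alias is defined.
# if has_ssh_host micro; then ssh micro; else echo "No 'micro' entry found" >&2; fi
```

As with Visual Studio Code, the terminal connection only works from the university network or with the VPN active.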
If you wish, you can change your password with the following command:
New passwords must be at least 12 characters long and contain digits, uppercase letters and symbols.
The server uses bash commands, and an excellent tutorial can be found here:
Our server uses a workload manager called slurm. All heavy jobs (read: jobs that do more than move files around, make new folders, and view or modify text files) should be run through slurm.
A quick start guide can be found here: https://docs.uppmax.uu.se/cluster_guides/slurm/
The full manual is linked below. Pay close attention to the commands sbatch, srun, scancel and squeue, as they are the ones you will use most often: https://slurm.schedmd.com/documentation.html
To run jobs using slurm, you will need to have a project code that you will specify with the -A option. Your supervisor should have received such a project code from Nico and will be able to share it with you.
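As a rough illustration of how these four commands and the -A option fit together, here is a minimal sketch. The file name hello_slurm.sh and the resource values are arbitrary choices for this example, and YOUR_PROJECT_CODE is a placeholder to replace with the code from your supervisor.

```shell
# Placeholder project code: replace it with the one your supervisor gives you.
ACCOUNT="YOUR_PROJECT_CODE"

# Write a tiny test job that only prints the name of the compute node.
cat > hello_slurm.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem=1G
hostname
EOF

# The commands below need slurm, so they are skipped on machines without it.
if command -v sbatch >/dev/null 2>&1; then
    sbatch -A "$ACCOUNT" hello_slurm.sh   # submit the script as a batch job
    squeue -u "$USER"                     # list your pending and running jobs
    # srun -A "$ACCOUNT" --pty bash       # open an interactive shell on a compute node
    # scancel JOB_ID                      # cancel a job, using the ID printed by sbatch
fi
```

By default, slurm writes the output of a batch job to a file named slurm-JOB_ID.out in the folder from which the job was submitted.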
This video contains good information to start, as well as basics of slurm and github: https://www.youtube.com/watch?v=3XMHTixiszE
Templates for running many bioinformatics tools, or for some common R analyses, can be downloaded from github:
https://github.com/nicolasDelhomme/project-template
Download them to a folder on the server with this command:
All the templates are in UPSCb-common/templates or UPSCb-common/pipeline
We have singularity containers set up for many common bioinformatics tools. They are located at /mnt/picea/storage/singularity/ or /mnt/picea/storage/singularity/kogia/, can be linked to your current directory through a symbolic link, and are accessed through the command singularity exec. Example:
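A minimal sketch of the linking step, followed by a call to singularity exec. The helper name link_container, the image name some_tool.sif and the program some_tool are hypothetical; list the container folder to see which images actually exist.

```shell
# Create a symbolic link to a container image in the current directory
# and print the name of the link.
# Usage: link_container /path/to/image.sif
link_container() {
    image_path="$1"
    link_name=$(basename "$image_path")
    if [ ! -e "$image_path" ]; then
        echo "Container not found: $image_path" >&2
        return 1
    fi
    ln -sf "$image_path" "$link_name"
    echo "$link_name"
}

# Example with a hypothetical image and program name:
# ls /mnt/picea/storage/singularity/
# link_container /mnt/picea/storage/singularity/some_tool.sif
# singularity exec some_tool.sif some_tool --help
```

Remember that anything heavier than a quick test of the container should be submitted through slurm.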
We have a server running Posit Pro, called AspSeq, from which you will be able to run R code directly on the files present in the server.
You can access it at https://aspseq.upsc.se:8080/s/57ea13c286bd33c286bd3/workspaces/
However, please do not use it unless necessary, as our license supports a limited number of users. That is, use it if you need to do data analysis, visualization or other operations that benefit from using Posit, but do not access it if you just need to explore folders, move or copy files, write sh files to submit with sbatch, or other similar operations that you could do with Visual Studio Code.
AspSeq has a limited quantity of memory available, and for this reason it is necessary to keep an eye on the available memory under the Environment tab while you work.
It is better to avoid loading enormous datasets on AspSeq (for example sam files), and it is similarly suggested to not perform heavy operations. In general, this service is mostly there to allow you to perform data exploration and visualization using the R commands you know, without having to worry about exporting files from the server to your computer. It is not there to allow you to run computationally heavy operations, which are supposed to be done through slurm.
When the memory is more than half full it is better not to work on AspSeq at all, because if it crashes we will have to kill all sessions and everybody will lose any work not saved in a file.
A good policy to save memory is to regularly remove variables from your environment when you no longer need them. AspSeq will save a session and allow you to keep working on it the next time you connect, and it is a big waste of space if you never clean the environment and end up keeping a dataframe you do not need loaded there for 10 months.
Sometimes you actually need lots of memory to successfully run an R script. No problem, just do not run it on AspSeq!
It is possible to send an R script for execution through slurm, using the command Rscript.
First, save your R script to an R file, and make sure that it exports its results to a file.
Then, write an sh file like the following one, which contains the instructions to execute the file my_script.R:
#!/bin/bash
#SBATCH --account=YOUR_PROJECT_CODE
#SBATCH --ntasks=1
#SBATCH --time=10:00:00
#SBATCH --mem=10G
R=/mnt/picea/home/singularity/R-4.4.3.sif
Rlib=/mnt/picea/home/rstudio/Modules/apps/compilers/R/4.4.3/lib/R/library:/usr/local/lib/R/library
export R_LIBS=/mnt/picea/home/rstudio/Modules/apps/compilers/R/4.4.3/lib/R/library
apptainer exec -e -B /mnt:/mnt -B $Rlib $R Rscript \
--vanilla my_script.R
Once you have saved this file, you can run it with slurm using the command sbatch.
When the job is done, you can then import the resulting file (normally a dataframe) with AspSeq and proceed to visualization and other operations that do not require a lot of memory.